MQJoin: Efficient Shared Execution of Main-Memory Joins
Authors
Abstract
Database architectures typically process queries one at a time, executing concurrent queries in independent execution contexts. Often, such a design leads to unpredictable performance and poor scalability. One approach to circumvent the problem is to take advantage of sharing opportunities across concurrently running queries. In this paper we propose Many-Query Join (MQJoin), a novel method for sharing the execution of a join that can efficiently deal with hundreds of concurrent queries. This is achieved by minimizing redundant work and making efficient use of main-memory bandwidth and multi-core architectures. Compared to existing proposals, MQJoin is able to efficiently handle larger workloads regardless of the schema by exploiting more sharing opportunities. We also compared MQJoin to two commercial main-memory column-store databases. For a TPC-H based workload, we show that MQJoin provides 2-5x higher throughput with significantly more stable response times.
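To make the idea of sharing one join across concurrent queries concrete, the following is a minimal sketch of a generic shared hash join in which every tuple carries a bitmask of the queries it qualifies for. It is not MQJoin's actual algorithm, which the abstract does not detail; the function name `shared_hash_join`, the `key` field, and the sample predicates are all assumptions made for illustration.

```python
from collections import defaultdict


def shared_hash_join(build_rows, probe_rows, build_preds, probe_preds):
    """Join two inputs on their 'key' field once on behalf of many queries.

    build_preds[q] and probe_preds[q] are query q's filters on each side
    (hypothetical interface). Returns {query_id: [(build_row, probe_row), ...]}.
    """
    n = len(build_preds)

    # Build phase: a single hash table shared by all queries. Each entry
    # stores a bitmask of the queries whose build-side filter it passes.
    table = defaultdict(list)
    for row in build_rows:
        mask = sum(1 << q for q in range(n) if build_preds[q](row))
        if mask:  # rows that no query wants are never inserted
            table[row["key"]].append((row, mask))

    # Probe phase: each probe row is looked up once; ANDing the two masks
    # yields the set of queries for which the joined pair is a result.
    results = {q: [] for q in range(n)}
    for row in probe_rows:
        pmask = sum(1 << q for q in range(n) if probe_preds[q](row))
        if not pmask:
            continue
        for brow, bmask in table.get(row["key"], ()):
            both = bmask & pmask
            q = 0
            while both:
                if both & 1:
                    results[q].append((brow, row))
                both >>= 1
                q += 1
    return results


# Two concurrent queries over the same pair of (made-up) tables.
orders = [{"key": 1, "price": 10}, {"key": 2, "price": 50}]
items = [{"key": 1, "qty": 3}, {"key": 2, "qty": 7}, {"key": 2, "qty": 1}]
build_preds = [lambda r: r["price"] > 5, lambda r: r["price"] > 20]
probe_preds = [lambda r: True, lambda r: r["qty"] > 2]

results = shared_hash_join(orders, items, build_preds, probe_preds)
print(len(results[0]), len(results[1]))
```

In this sketch the hash table is built and probed once regardless of how many queries are registered, so the per-query cost is reduced to predicate evaluation and a bitwise AND. That is the kind of redundant work a shared join aims to avoid, under the assumptions stated above.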
Similar resources
Fast similarity join for multi-dimensional data
To appear in Information Systems Journal, Elsevier, 2005. The efficient processing of multidimensional similarity joins is important for a large class of applications. The dimensionality of the data for these applications ranges from low to high. Most existing methods have focused on the execution of high-dimensional joins over large amounts of disk-based data. The increasing sizes of main memor...
Execution replay for an MPI-based multi-threaded runtime system
In this paper we present an execution replay system for Athapascan, an MPI-based multi-threaded runtime system. The main challenge of this work was to deal with nondeterministic features of MPI promiscuous communications and varying number of test functions without compromising the efficiency of an existing solution for execution replay of shared memory thread based programs. Novel solutions we...
Partitioning Inverted Lists for Efficient Evaluation of Set-Containment Joins in Main Memory
We present an algorithm for efficient processing of set-containment joins in main memory. Our algorithm uses an index structure based on inverted files. We focus on improving performance of the algorithm in a main-memory environment by utilizing the L2 CPU cache more efficiently. To achieve this, we employ some optimizations including partitioning the inverted lists and compressing the intermed...
GPU processing of theta-joins
The GPGPU paradigm has been recently employed to accelerate the processing of big amounts of data through the utilization of the massive parallelism offered by modern GPUs. To date, several techniques have been proposed for the implementation of simple select, aggregate and equality join operations on GPUs. In this paper, we study the efficient implementation of theta-join queries between two r...
Memory Efficient Processing of DNA Sequences in Relational Main-Memory Database Systems
Pipeline breaking operators such as aggregations or joins require database systems to materialize intermediate results. In case that the database system exceeds main memory capacities due to large intermediate results, main-memory database systems experience a massive performance degradation due to paging or even abort queries. In our current research on efficiently analyzing DNA sequencing dat...
Journal title:
- PVLDB
Volume 9, Issue
Pages -
Publication date: 2016